Image Captioning with Context-Aware Auxiliary Guidance
Authors
Abstract
Image captioning is a challenging computer vision task that aims to generate a natural language description of an image. Most recent research follows the encoder-decoder framework, which depends heavily on previously generated words for the current prediction. Such methods cannot effectively take advantage of future predicted information to learn complete semantics. In this paper, we propose a Context-Aware Auxiliary Guidance (CAAG) mechanism that guides the captioning model to perceive global contexts. On top of the captioning model, CAAG performs semantic attention that selectively concentrates on useful information in the global predictions to reproduce the current generation. To validate the adaptability of the method, we apply CAAG to three popular captioners, and our proposal achieves competitive performance on the Microsoft COCO image captioning benchmark, e.g. a 132.2 CIDEr-D score on the Karpathy split and a 130.7 CIDEr-D (c40) score on the official online evaluation server.
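The abstract does not say what form the semantic attention takes, so the following is only a minimal sketch under stated assumptions. It assumes scaled dot-product attention in which the decoder state at the current step acts as the query and the embeddings of all words in a first-pass caption (the "global predictions") act as keys and values. The names `semantic_attention`, `query`, and `predicted_embeddings` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def semantic_attention(query, predicted_embeddings):
    """Attend over embeddings of every word in a first-pass caption.

    Assumption: scaled dot-product scoring; the paper may use a
    different scoring function.
    """
    d = len(query)
    # One score per predicted word, including words after the current step.
    scores = [
        sum(q * k for q, k in zip(query, emb)) / math.sqrt(d)
        for emb in predicted_embeddings
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the word embeddings.
    context = [
        sum(w * emb[i] for w, emb in zip(weights, predicted_embeddings))
        for i in range(d)
    ]
    return weights, context


# Toy inputs (made up for illustration): a 2-d decoder state and three words.
query = [1.0, 0.0]
predicted = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = semantic_attention(query, predicted)
```

In this toy case the first word is most similar to the query and so receives the largest weight. A re-prediction step could then combine `context` with the decoder state, although how CAAG does this is not stated in the abstract.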
Similar resources
Image Captioning with Attention
In the past few years, neural networks have fueled dramatic advances in image classification. Emboldened, researchers are looking for more challenging applications for computer vision and artificial intelligence systems. They seek not only to assign numerical labels to input data, but to describe the world in human terms. Image and video captioning is among the most popular applications in this t...
Context-Aware Image Compression
We describe a physics-based data compression method inspired by the photonic time stretch wherein information-rich portions of the data are dilated in a process that emulates the effect of group velocity dispersion on temporal signals. With this coding operation, the data can be downsampled at a lower rate than without it. In contrast to previous implementation of the warped stretch compression...
Image Captioning with Sparse LSTM
Long Short-Term Memory (LSTM) is widely used to solve sequence modeling problems, for example, image captioning. We found the LSTM cells are heavily redundant. We adopt network pruning to reduce the redundancy of LSTM and introduce sparsity as new regularization to reduce overfitting. We can achieve better performance than the dense baseline while reducing the total number of parameters in LSTM...
Correction: Context-Aware Image Compression
[This corrects the article DOI: 10.1371/journal.pone.0158201.].
Phrase-based Image Captioning
Generating a novel textual description of an image is an interesting problem that connects computer vision and natural language processing. In this paper, we present a simple model that is able to generate descriptive sentences given a sample image. This model has a strong focus on the syntax of the descriptions. We train a purely bilinear model that learns a metric between an image representat...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i3.16361